 index file


Understand The concept of Indexing in depth! - Analytics Vidhya

#artificialintelligence

Data engineers and data scientists often have to deal with enormous amounts of data, and handling data at that scale is not straightforward. To process it as efficiently as possible, we need a clear understanding of how the data is organized. So before moving on to the main topic, let us lay some groundwork first. Memory in a computer system is organized in a hierarchy (as shown in the diagram below).


Managing Data in Massive-Scale Vector Search Engine

#artificialintelligence

A search over the Raw Data File is a brute-force search: it computes the distance between the query vector and every original vector, then returns the k nearest vectors. Search efficiency can be greatly increased if the search is instead based on an Index File, in which the vectors are indexed. Building an index requires additional disk space and is usually time-consuming. So what are the differences between Raw Data Files and Index Files? To put it simply, a Raw Data File records every single vector together with its unique ID, while an Index File records vector clustering results such as the index type, the cluster centroids, and the vectors in each cluster.
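
The contrast between the two file types can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions, not the engine's actual file format or indexing algorithm: the function names (`brute_force_search`, `build_index`, `index_search`), the `n_probe` parameter, and the simple k-means-style clustering are all hypothetical choices made for the example. The raw data is modeled as a mapping from unique ID to vector, and the index as a record holding an index type, the cluster centroids, and the IDs assigned to each cluster.

```python
import random


def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def brute_force_search(raw_data, query, k):
    """Compare the query against every stored vector and return the k nearest IDs."""
    ranked = sorted(raw_data, key=lambda vid: squared_l2(raw_data[vid], query))
    return ranked[:k]


def assign_to_clusters(raw_data, centroids):
    """Put each vector ID into the cluster of its nearest centroid."""
    clusters = [[] for _ in centroids]
    for vid, vec in raw_data.items():
        nearest = min(range(len(centroids)), key=lambda c: squared_l2(centroids[c], vec))
        clusters[nearest].append(vid)
    return clusters


def build_index(raw_data, n_clusters, n_iters=10, seed=0):
    """Toy k-means-style clustering (an assumption, not the engine's real method)."""
    rng = random.Random(seed)
    centroids = [list(raw_data[vid]) for vid in rng.sample(sorted(raw_data), n_clusters)]
    for _ in range(n_iters):
        clusters = assign_to_clusters(raw_data, centroids)
        for c, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous centroid
                dim = len(centroids[c])
                centroids[c] = [
                    sum(raw_data[v][d] for v in members) / len(members) for d in range(dim)
                ]
    # Final assignment so the stored clusters match the stored centroids.
    clusters = assign_to_clusters(raw_data, centroids)
    return {"index_type": "toy-cluster", "centroids": centroids, "clusters": clusters}


def index_search(index, raw_data, query, k, n_probe):
    """Scan only the n_probe clusters whose centroids are closest to the query."""
    order = sorted(
        range(len(index["centroids"])),
        key=lambda c: squared_l2(index["centroids"][c], query),
    )
    candidates = [vid for c in order[:n_probe] for vid in index["clusters"][c]]
    candidates.sort(key=lambda vid: squared_l2(raw_data[vid], query))
    return candidates[:k]


# Synthetic data: 200 random 8-dimensional vectors keyed by a unique integer ID.
rng = random.Random(42)
raw_data = {i: [rng.random() for _ in range(8)] for i in range(200)}
query = [rng.random() for _ in range(8)]

index = build_index(raw_data, n_clusters=8)
exact = brute_force_search(raw_data, query, k=5)
approx = index_search(index, raw_data, query, k=5, n_probe=2)
print("brute force:", exact)
print("index, 2 of 8 clusters probed:", approx)
```

In this sketch, probing fewer clusters means fewer distance computations, at the cost of possibly missing a true neighbor that sits in an unprobed cluster. Probing every cluster scans the same vectors as the brute-force search and therefore finds the same neighbors. Building the index is the expensive step here as well, which mirrors the extra time and space cost mentioned above.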